Adding Word Duration Information to Bigram Language Models

Authors

  • George Doddington
  • Aravind Ganapathiraju
  • Joe Picone
  • Yufeng Wu
Abstract

Suprasegmental information, while generally thought to play an important role in speech recognition by human listeners, has shown little promise in previous attempts to integrate it into ASR systems. This paper outlines an approach that successfully exploits suprasegmental information by modeling duration within the context of N-gram language modeling. Results show that up to half of the variance in word-level timing can be explained by a simple bigram duration model. The experiments were conducted on the Switchboard corpus of conversational telephone speech. The paper also outlines a way of augmenting the N-gram language model with suprasegmental information.
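The abstract does not specify the form of the bigram duration model, so the following is only a minimal sketch under stated assumptions: each word's duration is predicted by the mean duration observed for that (previous word, word) pair, and "variance explained" is taken to be one minus the ratio of residual variance to total variance, measured on the same data used to fit the means. The toy durations and the function names `fit_bigram_duration_means` and `variance_explained` are hypothetical and are not taken from the paper or from Switchboard.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical toy data: (previous_word, word, duration_in_seconds).
# The numbers are invented for illustration only.
TOKENS = [
    ("<s>", "i", 0.10),
    ("i", "think", 0.25),
    ("<s>", "i", 0.14),
    ("i", "think", 0.31),
    ("you", "think", 0.40),
    ("you", "think", 0.44),
    ("think", "so", 0.30),
    ("think", "so", 0.34),
    ("not", "so", 0.20),
    ("not", "so", 0.18),
]


def fit_bigram_duration_means(tokens):
    """Return the mean duration for each (previous_word, word) pair."""
    durations_by_bigram = defaultdict(list)
    for prev, word, duration in tokens:
        durations_by_bigram[(prev, word)].append(duration)
    return {bigram: mean(ds) for bigram, ds in durations_by_bigram.items()}


def variance_explained(tokens, bigram_means):
    """Fraction of duration variance explained by the bigram means.

    Assumption: computed in-sample as 1 - residual_variance / total_variance.
    """
    total = pvariance([duration for _, _, duration in tokens])
    squared_residuals = [
        (duration - bigram_means[(prev, word)]) ** 2
        for prev, word, duration in tokens
    ]
    return 1.0 - mean(squared_residuals) / total


if __name__ == "__main__":
    means = fit_bigram_duration_means(TOKENS)
    print(f"variance explained: {variance_explained(TOKENS, means):.3f}")
```

Because the means are fitted and scored on the same tokens, this figure is optimistic; a held-out split and a back-off to per-word means for unseen bigrams would be needed before comparing against a figure such as the one reported in the abstract.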

Similar resources

Integrating Large Context Language Models into a Real Time Word Recognizer

In this paper we present a new recognizer architecture that allows the efficient integration of language models with arbitrary large context information, e.g. polygram models, into the recognition process. Instead of using these models for rescoring the n best word chains generated using bigram information, we extract the best word chain, or optionally the n best word chains, directly from the wo...

Interpolated Distanced Bigram Language Models for Robust Word Clustering

Two methods for interpolating the distanced bigram language model are examined which take into account pairs of words that appear at varying distances within a context. The language models under study yield a lower perplexity than the baseline bigram model. A word clustering algorithm based on mutual information with robust estimates of the mean vector and the covariance matrix is employed in t...

Word Pairs in Language Modeling for Information Retrieval

Previous language modeling approaches to information retrieval have focused primarily on single terms. The use of bigram models has been studied, but the restriction on word order and adjacency may not be justified for information retrieval. We propose a new language modeling approach to information retrieval that incorporates lexical affinities, or pairs of words that occur near each other, wi...

Using a stochastic context-free grammar as a language model for speech recognition

This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous continuous-speech understanding system (Jurafsky et al. 1994). We describe an algorithm for using a probabilistic Earley parser and a stochastic context-free grammar (SCFG) to generate word transition prob...

Subword lexical modelling for speech recognition

In this work, we introduce and develop a novel framework, ANGIE, for modelling subword lexical phenomena in speech recognition. Our framework provides a flexible and powerful mechanism for capturing morphology, syllabification, phonology, and other subword effects in a hierarchical manner which maximizes sharing of subword structures. ANGIE models the subword structure within a context-free grammar...

Publication date: 1999